Blackboard system and top-down processing for the transcription of simple polyphonic music
A system is proposed to perform the automatic music transcription of simple polyphonic tracks using top-down processing. It is composed of a blackboard system of three hierarchical levels, receiving its input from a segmentation routine in the form of an averaged STFT matrix. The blackboard contains a hypotheses database, a scheduler and knowledge sources, one of which is a neural network chord recogniser with the ability to reconfigure the operation of the system, allowing it to output more than one note hypothesis at a time. The basic implementation is explained, and some examples are provided to illustrate the performance of the system. The weaknesses of the current implementation are shown and next steps for further development of the system are defined.
Monophonic transcription with autocorrelation
This paper describes an algorithm that performs monophonic music transcription. A pitch tracker calculates the fundamental frequency of the signal from the autocorrelation function. A continuity-restoration block takes the extracted pitch and determines the score corresponding to the original performance. A signal envelope analysis completes the transcription system by calculating attack-sustain-decay-release times, which improves the synthesis process. Attention is also paid to the extraction of timbre and to wavetable synthesis.
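As a rough illustration of autocorrelation-based pitch tracking, the sketch below estimates the fundamental frequency of a single frame by picking the lag with the largest autocorrelation value. It is not the paper's algorithm: the function name `autocorr_pitch`, the search range `fmin`/`fmax`, and the simple peak-picking rule are assumptions made for this example, and the continuity-restoration and envelope stages are left out.

```python
import math

def autocorr_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate f0 in Hz from the strongest autocorrelation peak (hypothetical helper)."""
    n = len(frame)
    min_lag = int(sample_rate / fmax)               # shortest period considered
    max_lag = min(int(sample_rate / fmin), n - 1)   # longest period considered
    best_lag, best_val = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation of the frame at this lag
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    # None means no positive correlation was found in the search range
    return sample_rate / best_lag if best_lag else None

# Usage: a 200 Hz sine sampled at 8 kHz has a period of 40 samples
sr = 8000
frame = [math.sin(2 * math.pi * 200.0 * t / sr) for t in range(1024)]
f0 = autocorr_pitch(frame, sr)
```

Because the unnormalised autocorrelation shrinks as the lag grows, the first period tends to win over its multiples, which limits octave errors on clean signals. Real recordings usually need normalisation and smoothing across frames.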
Real-Time Separation Of Transient Information In Musical Audio Using Multiresolution Analysis Techniques
Whilst musical transients are generally acknowledged as holding much of the perceptual information within musical tones, most research in sound analysis and synthesis tends to focus on the steady-state components of signals. A method is presented which separates the noisy transient information from the slowly time-varying steady-state components of musical audio. The improvements gained from adaptive thresholding and multiresolution analysis methods are then illustrated. It is shown that by analyzing only the resulting transient information, current onset detection algorithms can be improved considerably, especially for instruments with noisy attack information, such as plucked or struck strings. The idea is then applied to audio processing techniques that enhance or reduce the strength of note attack information. Finally, the transient extraction algorithm (TSS) is applied to a time-scaling implementation, where the transient and noise information is analyzed so that only steady-state regions are stretched, yielding considerably improved results.
Digital Audio Effects in the Wavelet Domain
Audio signals are often stored or transmitted in a compressed representation, which can pose a problem if signal processing is required: the signal will probably have to be converted back to a time-domain representation, processed, and then re-transformed. This is time-consuming and computationally intensive; it is potentially more efficient to apply signal processing while the signal remains in the transform domain. We have implemented a scheme whereby linear processing of the traditional type, often instinctively understood by those working in the audio field, may be applied to signals stored in a wavelet-domain representation. Results are presented which demonstrate that the method produces the same output (to within the limits of machine precision) as time-domain processing, for less computational effort than would be required for the full explicit process through the time domain and back again. The potential benefits for linear effects processing (for example, EQ and sample-level delays and echoes) and also for non-linear processing, such as dynamics processing, will be introduced and discussed.
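The claim that a linear effect can give the same result in the transform domain as in the time domain can be checked on a toy case. The sketch below assumes a one-level Haar transform and an echo whose delay is an even number of samples, so that the delay becomes a plain shift of the coefficient sequences. These choices, and the helper names `haar_forward`, `haar_inverse`, and `delay_coeffs`, are assumptions made for illustration and are not taken from the paper, which does not state its wavelet or scheme in this abstract.

```python
import math

def haar_forward(x):
    """One-level Haar transform of an even-length list: (approximation, detail)."""
    s = math.sqrt(2.0)
    half = len(x) // 2
    approx = [(x[2 * k] + x[2 * k + 1]) / s for k in range(half)]
    detail = [(x[2 * k] - x[2 * k + 1]) / s for k in range(half)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def delay_coeffs(band, k):
    """Delay a sequence by k positions, zero-filling the start."""
    return [0.0] * k + band[:len(band) - k]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
gain = 0.5

# Time-domain reference: y[n] = x[n] + gain * x[n - 2]
reference = [a + gain * b for a, b in zip(x, delay_coeffs(x, 2))]

# Same echo in the Haar domain: a 2-sample delay is a 1-coefficient delay per band
approx, detail = haar_forward(x)
approx_fx = [a + gain * b for a, b in zip(approx, delay_coeffs(approx, 1))]
detail_fx = [d + gain * b for d, b in zip(detail, delay_coeffs(detail, 1))]
processed = haar_inverse(approx_fx, detail_fx)
```

Odd delays mix the approximation and detail bands and need more care, which this sketch does not attempt.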
A Hybrid Approach to Musical Note Onset Detection
Common problems with current methods of musical note onset detection are the detection of fast passages of musical audio, the detection of all onsets within a passage with a strong dynamic range, and the detection of onsets of varying types, as in multi-instrumental music. We present a method that uses a subband decomposition approach to onset detection. An energy-based detector is used on the upper subbands to detect strong transient events. This yields precise time resolution for the onsets, but does not detect softer or weaker onsets. A frequency-based distance measure is formulated for use with the lower subbands, improving the detection accuracy of softer onsets. We also present a method for improving the detection function by using a smoothed difference metric. Finally, we show that the detection threshold may be set automatically from analysis of the statistics of the detection function, with results comparable in most places to manual setting of thresholds.
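The energy-based detector and the automatic threshold can be sketched in a few lines. The example below is a simplified, full-band illustration: it omits the subband decomposition and the frequency-based distance measure, and the threshold rule (mean plus `k` standard deviations of the detection function) is an assumption, since the abstract does not state which statistics the authors use. The function names are hypothetical.

```python
import math
import statistics

def energy_detection_function(signal, frame_len=64, hop=32):
    """Half-wave rectified difference of successive frame energies."""
    energies = [sum(v * v for v in signal[s:s + frame_len])
                for s in range(0, len(signal) - frame_len + 1, hop)]
    return [max(0.0, energies[i] - energies[i - 1])
            for i in range(1, len(energies))]

def pick_onsets(df, k=2.0):
    """Local maxima above mean + k * std (assumed threshold rule)."""
    threshold = statistics.mean(df) + k * statistics.pstdev(df)
    return [i for i in range(1, len(df) - 1)
            if df[i] > threshold and df[i] >= df[i - 1] and df[i] > df[i + 1]]

# Usage: silence followed by a tone that starts at sample 1000
signal = [math.sin(math.pi / 2 * n) if n >= 1000 else 0.0 for n in range(2000)]
df = energy_detection_function(signal)
onsets = pick_onsets(df)                       # indices into df
onset_samples = [(i + 1) * 32 for i in onsets] # start of the frame that triggered
```

Index `i` of the detection function compares frame `i + 1` with frame `i`, so the reported sample position is only accurate to within one hop.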
Automatic Polyphonic Piano Note Extraction Using Fuzzy Logic in a Blackboard System
This paper presents a piano transcription system that transforms audio into MIDI format. Human knowledge and psychoacoustic models are implemented in a blackboard architecture, which allows knowledge to be added with a top-down approach. The analysis is adapted to the information acquired. This technique is referred to as a prediction-driven approach, and it attempts to simulate the adaptation and prediction process taking place in human auditory perception. In this paper we describe the implementation of polyphonic note recognition using a Fuzzy Inference System (FIS) as one of the knowledge sources in a blackboard system. The performance of the transcription system shows that polyphonic music transcription is still an unsolved problem, with a success rate of 45% according to the Dixon formula. However, if we consider only the transcribed notes, the success rate increases to 74%. Moreover, the results presented in [1] show that the transcription can be used successfully in a retrieval system, encouraging the authors to develop this technique for more accurate transcription results.
Complex domain onset detection for musical signals
We present a novel method for onset detection in musical signals. It improves over previous energy-based and phase-based approaches by combining both types of information in the complex domain. It generates a detection function that is sharp at the position of onsets and smooth everywhere else. Results on a hand-labelled data set show that high detection rates can be achieved at very low error rates. The approach is more robust than its predecessors both theoretically and practically.
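One common way to combine energy and phase information in the complex domain is to predict each spectral bin from the previous frame's magnitude and a linearly extrapolated phase, then sum the distances between the predicted and observed complex values. The sketch below follows that formulation under stated assumptions: the abstract gives no formula, so the prediction rule, the naive DFT, the frame sizes, and the function names are illustrative choices rather than the authors' implementation.

```python
import cmath
import math

def stft(signal, n_fft=64, hop=16):
    """Hann-windowed DFT frames, non-negative frequency bins only (naive DFT)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(n_fft)]
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n in range(n_fft))
            for k in range(n_fft // 2 + 1)
        ])
    return frames

def complex_domain_df(frames):
    """Sum over bins of |observed - predicted|; the first two frames get 0."""
    df = [0.0, 0.0]
    for m in range(2, len(frames)):
        total = 0.0
        for k in range(len(frames[m])):
            prev, prev2 = frames[m - 1][k], frames[m - 2][k]
            # Assume steady magnitude and constant phase advance per hop
            phase = 2 * cmath.phase(prev) - cmath.phase(prev2)
            predicted = cmath.rect(abs(prev), phase)
            total += abs(frames[m][k] - predicted)
        df.append(total)
    return df

# Usage: a steady tone, joined by a second tone at sample 320 (frame 20 starts there)
signal = [math.sin(math.pi / 4 * n)
          + (math.sin(2 * math.pi * 20 / 64 * n) if n >= 320 else 0.0)
          for n in range(640)]
df = complex_domain_df(stft(signal))
peak_frame = max(range(len(df)), key=df.__getitem__)
```

During the steady portions the prediction matches the observation almost exactly, so the function stays near zero and rises only in the frames that overlap the entry of the second tone.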
Matching live sources with physical models
This paper investigates the use of a physical model template database as the parameter basis for an MPEG-4 Structured Audio (MP4-SA) codec. During analysis, the codec attempts to match the closest corresponding instrument in the database. In this paper, we emphasize the mechanism enabling this match. We give an overview of the final front end, including the pitch detection stage, and discuss the remaining problems. A draft implementation, written in Python, is described.
A comparison Between Fixed and Multiresolution Analysis for Onset Detection in Musical Signals
A study is presented of multiresolution analysis-based onset detection in the complex domain. It shows that using variable time resolution across frequency bands generates sharper detection functions for higher bands and more accurate detection functions for lower bands. The resulting method improves the localisation of onsets over fixed-resolution schemes by favouring the increased time precision of higher subbands during the combination of results.
Digital Audio Effects Applied Directly on a DSD Bitstream
Digital audio effects are typically implemented on 16- or 24-bit signals sampled at 44.1 kHz. Yet high-quality audio is often encoded in a one-bit, highly oversampled format, such as DSD. Processing a bitstream, and applying audio effects to it, requires special care and modification of existing methods. However, it has strong advantages due to the high-quality phase information and the elimination of multiple decimators and interpolators in the recording and playback process. We present several methods by which audio effects can be applied directly on a bitstream. We also discuss the modifications that need to be made to existing methods for them to be properly applied to DSD audio. Methods are presented through the use of block diagrams, and results are reported.
Keywords: Sigma Delta Modulation, SACD, DSD, Digital Audio Effects, Bitstream Signal Processing
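To show why effects on a one-bit stream need special care, the sketch below uses a first-order sigma-delta modulator and a gain change. Scaling the bitstream yields values that are no longer one-bit, so the result is passed through the modulator again. This is a minimal sketch under stated assumptions: the first-order structure and the function name `sigma_delta_1bit` are chosen for illustration, practical DSD encoders use higher-order modulators, and none of this is the paper's method.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: inputs in [-1, 1] -> list of +1 / -1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous output bit
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(int(feedback))
    return bits

# Usage: encode a constant level of 0.5; the local average of the bits tracks it
bits = sigma_delta_1bit([0.5] * 1000)
mean_level = sum(bits) / len(bits)

# A gain of 0.5 turns the stream into values of +/-0.5, which are not one-bit,
# so the scaled stream is re-modulated to return to a +1 / -1 bitstream
scaled_bits = sigma_delta_1bit([0.5 * b for b in bits])
scaled_level = sum(scaled_bits) / len(scaled_bits)
```

The re-modulation step adds its own quantisation noise, which is one reason effects on bitstreams usually call for modified structures rather than a direct port of multi-bit designs.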